TECHNICAL FIELD

This invention relates to the production of squalene for
commercial purposes. More particularly, it relates to
plants in which squalene levels are increased, and to
nucleotide sequences that can be used to cause the desired
increase.

BACKGROUND ART

There is a US$125 million market for squalene, a colourless
oil used in the cosmetics and health industries (Kaiya,
1990). Squalene is currently obtained mainly from shark
liver oil, although it is also present in small quantities
in vegetable oils. Squalene extracted from shark liver is
declining in supply (Kaiya, 1990), and the harvesting of
sharks for this purpose is environmentally unfriendly and
is becoming less acceptable on environmental grounds.
Squalene can be extracted from olive oil, but the amounts
are not sufficient to supply the cosmetics market (Bondioli
et al. 1993). Squalene could be extracted from other
vegetable oils, but the levels of this hydrocarbon in the
oil are too low for this to be practical [...] process
(Bondioli et al.). Typically, squalene is
concentrated more than one hundred fold in the deodorizer
distillate relative to the levels in unrefined vegetable
oils. For commercial viability, vegetable oil deodorizer
distillates should contain at least 5% (w/w) squalene.
Currently, soybean and canola deodorizer distillates
contain squalene in the 0.1-3% range (Ramamurthi,
1994). Consequently, an increase of two-fold or more in
the squalene content of these oilseeds could result in
commercially viable squalene production from vegetable
oils.

It has been shown that in plant cell cultures
squalene accumulates in the presence of squalene
epoxidase inhibitors, e.g. allylamines such as
terbinafine (Yates et al. 1991). Apparently, much of the
squalene produced in plants is converted to the epoxide
by squalene epoxidase, and ultimately to plant sterols.
In fact, all plants and higher life forms contain squalene
and squalene epoxidase genes, but little squalene
accumulates in the tissues of such life forms because of
the effects of the expressed squalene epoxidase.
Therefore, inhibition of the epoxidase gives squalene an
opportunity to accumulate. However, there are as yet no
commercial processes based on this concept.

A main problem addressed by the inventors of the
present invention is therefore to create a plant crop,
particularly an oilseed crop, which accumulates squalene
in harvestable tissues, such as seeds, at sufficient
levels for commercially viable extraction.

An object of the present invention is to provide new 
sources of squalene that have the potential to be 
exploited on a commercial basis to replace conventional 
commercial sources of squalene. 
Another object of the present invention is to
generate squalene-producing plants modified to accumulate
squalene in the plant tissue (e.g. in seeds) in
sufficient quantities to make the extraction of squalene
commercially attractive.

Another object of the invention is to identify
squalene epoxidase genes in plants, and to partially or
completely neutralise the expression of such genes.

Another object of the invention is to produce DNA
clones, constructs and vectors suitable for modifying the
genomes of plants to reduce expression of squalene
epoxidase.

Yet another object of the invention is to provide a
commercial process for producing squalene from plant
tissue, especially seeds.

The inventors of the present invention have
discovered the DNA sequences of the genes encoding
squalene epoxidase (squalene monooxygenase
(2,3-epoxidizing); EC 1.14.99.7) from the plants Arabidopsis
thaliana (thale cress) and Brassica napus (rapeseed,
canola), as well as a second gene from Arabidopsis and
one from Ricinus communis (castor), and using this
knowledge have developed a process of modifying the
genomes of such plants to produce genetically-modified
plants which accumulate squalene at higher than natural
levels. Moreover, the process may be operated to
increase squalene levels in plants using DNA based on
squalene epoxidase genes from different but related
plants.

According to one aspect of the invention, there is
provided an isolated and cloned DNA (polynucleotide)
suitable for introduction into a genome of a plant to
suppress expression of squalene epoxidase by said plant
below natural levels, wherein the DNA has a sequence
corresponding at least in part to a squalene epoxidase
gene of a plant.

The DNA preferably has a sequence corresponding to
all or part of a specific sequence selected from SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:9 and SEQ ID
NO:10 (as shown in the following Sequence Listing); or
having at least 60% (more preferably at least 70%)
homology thereto.
The measure of homology between two DNA
(polynucleotide) sequences as used in this specification
is the similarity index given by application of the
Wilbur-Lipman algorithm of the MEGALIGN® computer
program (DNASTAR) in aligning and comparing DNA sequences
corresponding to a complete polypeptide coding region
using the parameters ktuple=3, gap penalty=3 and
window=20.
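
The 60% and 70% thresholds can be illustrated with a much simpler measure than the similarity index named above. The following Python sketch is not the Wilbur-Lipman algorithm and will not reproduce MEGALIGN values. It assumes two sequences that have already been aligned (gaps written as "-") and simply counts identical columns. The function names and the default threshold argument are hypothetical and chosen only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of non-gap alignment columns that hold the same base.

    Assumption: both inputs come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are ignored in this simplified measure
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True when the simplified identity reaches the stated threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A real comparison would use the alignment program and parameters stated in the specification; this sketch only shows how a percentage cut-off is applied once an alignment exists.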

According to another aspect of the invention, there
is provided a process of producing genetically-modified
plants having increased levels of squalene in tissues of
the plants compared to corresponding wild-type plants,
wherein the plant genome is modified to suppress
expression of squalene epoxidase by said plant. The
genome is modified by introducing at least one exogenous
DNA sequence that corresponds, at least in part, to one
or more endogenous squalene epoxidase genes of the plant.

The DNA sequence introduced into said plant genome
has at least 60%, and more preferably at least 70%,
homology to said one or more of the endogenous squalene
epoxidase genes, and is preferably all or part of a
sequence selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:9 and SEQ ID NO:10.

According to yet another aspect of the invention, at
least in a preferred form, there is provided a process of
producing genetically-modified plants having increased
levels of squalene in tissues of the plants compared to
corresponding wild-type plants, wherein the plant genome
is modified to suppress expression of squalene epoxidase
by said plant, raising squalene levels of a plant, by
introducing into the genome of the plant a nucleotide
sequence that reduces or prevents expression of squalene
epoxidase. The DNA introduced into the genome includes a
transcriptional promoter and a sequence that when
transcribed from the promoter is complementary or
antisense to all or part of at least one squalene
epoxidase messenger RNA produced by the plant.
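
The relationship between a sense fragment and the RNA that is "complementary or antisense" to the corresponding messenger RNA can be sketched in a few lines of Python. This is an illustration only, under the assumption that the fragment is given as the coding (sense) strand written 5' to 3'. The function names are hypothetical and the example sequence in the tests is arbitrary, not taken from the Sequence Listing.

```python
# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand (5'->3' in, 5'->3' out)."""
    return dna.translate(_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """RNA with the same sequence as the given DNA strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(sense_fragment: str) -> str:
    """RNA produced when the fragment is placed in reverse (antisense)
    orientation relative to a promoter: the reverse complement of the
    sense strand, written as RNA. It can base-pair with the sense mRNA."""
    return transcribe(reverse_complement(sense_fragment))
```

For a sense fragment `ATGGCC`, the sense mRNA reads `AUGGCC` and the antisense transcript reads `GGCCAU`, which pairs with the mRNA in antiparallel orientation.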

The invention also relates to plasmids and vectors 
used in the processes indicated above, and as disclosed 
later. 

The invention further relates to a genetically-
modified plant capable of accumulating squalene at levels
higher than the corresponding wild-type plant, produced
by a process as indicated above, or a seed of such a
plant.

The invention additionally relates to a process of
producing squalene, which involves growing a genetically-
modified plant as defined above, harvesting the plant or
seeds of the plant, and extracting squalene from the
harvested plant or seeds.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the alignment of deduced amino acid
sequences of the clones pDR111 (B. napus 111) [SEQ ID
NO:4], pDR411 (B. napus 411) [SEQ ID NO:11] and 129F12T7
(Arabidopsis) [SEQ ID NO:2], and of the squalene
epoxidase genes of mouse (DNA Database of Japan D42048)
[SEQ ID NO:6], rat (DNA Database of Japan D37920) [SEQ ID
NO:7], and baker's yeast (Genbank M64994) [SEQ ID NO:8];
the alignment was done using the MEGALIGN™ program of
the LASERGENE™ suite of programs (DNASTAR) using a
multiple alignment gap penalty of 20; and

Figures 2, 3 and 4 are plasmid maps of three vectors
(pSE111A, pSE411A and pSE129A, respectively) produced
according to one embodiment of the present invention.

BEST MODES FOR CARRYING OUT THE INVENTION

General Discussion

The concept underlying the present invention is to
identify squalene epoxidase genes of oilseed plants (or
possibly other plants, since all plants appear to be
capable of producing squalene, particularly those plants
that are capable of accumulating oils in their tissue)
and then to create genetically-modified plants in which
the expression of those genes is suppressed, so that
squalene naturally produced by the plant accumulates in
the seeds or other tissue to levels that make extraction
commercially attractive.

The approach taken by the inventors of the present
invention to identify squalene epoxidase genes of plants
was to use the squalene epoxidase gene from yeast to
identify equivalent genes in suitable plant species, e.g.
by heterologous hybridization, on the assumption that all
squalene epoxidase genes will have a considerable degree
of similarity. Once one or several plant squalene
epoxidase genes have been identified in this way, those
plant genes can then be used to identify additional
squalene epoxidase genes from other plants.



Heterologous Hybridization

Nucleic acid hybridization is a technique used to
identify specific nucleic acids from a mixture. Southern
analysis is a type of nucleic acid hybridization in which
DNA is typically digested with restriction enzymes,
separated by gel electrophoresis and bound to a
nitrocellulose or nylon membrane. A nucleic acid probe,
radioactively labelled or otherwise rendered easily
detectable, is hybridized to the bound DNA by exposing
it to the membrane-bound DNA. The location of the bound
probe is then detected by autoradiography or other
detection method. The location of the bound probe is an
indication that DNA sequences that are similar to those
in the probe nucleic acid are present. Hybridization may
also be done with DNA of clones of a recombinant DNA
library, such as a cDNA library, when that DNA has been
bound to a membrane after plating the library out
(Ausubel et al., 1994).
Of course, the method used by the inventors to identify
the genes disclosed in the present application may be
used to identify equivalent genes from other plants. As
noted above, the process originally used by the inventors
to identify the Arabidopsis gene was based on further
analysis of a gene that was tentatively identified from a
publicly available database containing partial sequences
(Expressed Sequence Tags or ESTs) submitted by other
workers from randomly chosen (unidentified) gene clones.
ESTs from other species (such as rice, castor) can also
be searched in the same way to find other possible
squalene epoxidase genes present in such plants
(depending on the more or less accidental sequencing of
the desired genes) using the Arabidopsis and B. napus
sequences disclosed herein.

The inventors have, for example, found other ESTs
from plants that have tentatively been identified as
squalene epoxidase genes by comparing them to the
Arabidopsis and B. napus sequences discussed above.
Thus, sequences corresponding to Genbank Accession
Numbers T15019 (obtainable from Dr. C.R. Somerville,
Carnegie Institution, 290 Panama St., Stanford, CA 94305
USA) and W43353 (obtainable from DNA Stock Center,
Arabidopsis Biological Resource Center, Ohio State
University, 1060 Carmack Road, Columbus, OH 43210-1002
USA) have been predicted to correspond to squalene
epoxidase genes from Ricinus communis (castor) and
Arabidopsis (a second Arabidopsis gene).

Perhaps more importantly, the process by which the
B. napus gene was cloned can be used to clone the gene
from other plant species. The (heterologous hybridization)
methods are well known, but the process requires the
knowledge and use of the novel plant squalene epoxidase
sequences disclosed in this application.

If the hybridization and washing are done under
conditions which are considered stringent (e.g., at
relatively high temperature and/or low salt and/or high
formamide concentration), then the sequences detected
generally have a high degree of similarity to the probe
nucleic acid. If hybridization and washing are done at
lower stringency, then it is possible to detect sequences
that are lower in similarity to the probe. Discussions
of this detection of similar sequences by hybridization
can be found in Beltz et al. (1983) and Yamamoto and
Kadowaki (1995). From the point of view of gene cloning,
if one obtains a clone for a gene in one organism, one
can use low stringency hybridization of the DNA clones
corresponding to a related organism to detect the
homologous gene sequences of that organism. As mentioned
before, the success of this approach depends on the
similarity of the sequences of the homologous genes, which
in turn generally depends on the evolutionary
relationship between the organisms.
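
The effect of temperature, salt and formamide on stringency can be made concrete with a widely used empirical approximation for the melting temperature (Tm) of long DNA-DNA hybrids. The Python sketch below is offered as an illustration and is not taken from the specification. The formula, its coefficients and the rule of thumb of roughly 1 °C lower Tm per 1% mismatch are assumptions drawn from general molecular biology practice, and the function names are hypothetical.

```python
import math


def estimate_tm_celsius(length_bp: int, gc_percent: float,
                        na_molar: float, formamide_percent: float = 0.0) -> float:
    """Approximate Tm of a perfectly matched DNA-DNA hybrid.

    Empirical approximation (assumed here, valid only for hybrids of
    roughly 50 bp or more):
      Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def tm_after_mismatch(tm_perfect: float, percent_mismatch: float,
                      drop_per_percent: float = 1.0) -> float:
    """Rule-of-thumb correction: about 1 degree C lower per 1% mismatch."""
    return tm_perfect - drop_per_percent * percent_mismatch
```

Under these assumptions, lowering the salt concentration or adding formamide lowers the Tm, so a fixed wash temperature becomes more stringent. A probe that is only 70% identical to its target (30% mismatch) would melt roughly 30 °C below a perfect match, which is why lower stringency conditions are needed to detect it.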

Once identified, sequenced and cloned, the DNA of
suitable plant species may then be modified or
manipulated with any technique capable of decreasing the
expression of a natural gene based on an isolated DNA
clone corresponding, at least in part, to that gene.
Suitable methods, at present, include antisense
technologies (Bourque, 1995), co-suppression or gene
silencing technologies (Meyer, 1995; Stam et al., 1997;
Matzke and Matzke, 1995), and ribozyme technologies
(Wegener et al. 1994; Barinaga, 1993).

These technologies are discussed in more detail below.



Down-regulation of Gene Expression

General

The activity of a particular enzyme, such as
squalene epoxidase, is dependent on, among other things
(such as the biochemical environment), the amount of
enzyme (usually, and for the sake of this argument, a
protein) that is present. The amount of enzyme present
depends on the expression of the gene or genes encoding
the enzyme of interest. Gene expression usually includes
(not necessarily in this order) transcription of DNA to
generate RNA, processing of the RNA produced from
transcription, transport of RNA to the site of
translation, translation of mature messenger RNA into
polypeptide, proteolytic processing and folding of the
nascent polypeptide, transport of the protein product to
various cellular compartments, and post-translational
modification of the protein (such as phosphorylation or
glycosylation). Any effect or difference in any of the
processes involved in gene expression can have an effect
on the level of expression of an enzyme encoded by a
given gene or genes. Gene expression often varies with
cell type, tissue type and developmental stage.
Likewise, enzyme levels in different cells and tissues
and at different developmental stages vary widely. (For
plant nuclear genes, this is often the result of
differential transcription.)

Gene expression can also be affected by the
breakdown of the gene product, the enzyme, or any of the
intermediates in gene expression, such as precursor RNA.
From a genetic engineering point of view, in
principle, gene expression can be down-regulated by
affecting almost any of the processes involved. For
example, although the mechanism is not well established,
antisense technology, as discussed below, decreases the
amount of translatable messenger RNA (mRNA) in an
organism.


A) Antisense technology

An appropriate antisense technology is disclosed
in US patent 5,190,931 issued on March 2,
1993 to Masayori Inouye. The disclosure of this patent
is incorporated herein by reference. In short, this
technology can be used to regulate or inhibit gene
expression in a cell by incorporating into the genetic
material of the cell a nucleic acid sequence which is
transcribed to produce an mRNA which is complementary to
and capable of binding to the mRNA produced by the
genetic material of the cell. The introduced nucleic
acid sequences include equivalents of the gene to be
regulated, or parts thereof, oriented in antisense
fashion relative to a transcriptional promoter. Thus,
the squalene epoxidase sequence, or part thereof, is
introduced into the genetic material of the cell as a
construct positioned between a transcriptional promoter
segment and a transcriptional termination segment. The
mRNA produced when the antisense sequences are
transcribed binds or hybridizes to the mRNA from the
squalene epoxidase gene of interest and prevents
translation to a corresponding protein. Therefore, the
protein coded for by the gene is not produced, or is
produced in smaller quantities than would otherwise be
the case. By introducing a gene that has a sequence that
is antisense to the natural squalene epoxidase gene in
oilseed plants, the epoxidation of squalene can be
inhibited or reduced so that squalene accumulates in the
plant tissues, especially the seeds, which can then be
harvested in the usual way and the squalene extracted
using conventional techniques.

In terms of the process of antisense down-regulation
of squalene epoxidase genes, for any plant species, it is
generally necessary to use a gene from a closely related
plant such that the genes are more than about 60%, and
preferably about 70%, identical at the DNA level (Murphy,
1996). Thus, homologous (equivalent) genes from the same
family of plants would reasonably be expected to give an
antisense effect on any member species of that family.
For example, Arabidopsis genes have been found to have
antisense effects in B. napus (Murphy, 1996).

The antisense DNA in expressible form may be
introduced into plant cells by any suitable
transformation technique, e.g. in planta transformation
(such as wound inoculation or vacuum infiltration).
Transformation may also be carried out by co-cultivation
of cotyledonary petioles and hypocotyl explants (e.g. of
B. napus and B. carinata) with A. tumefaciens bearing
suitable constructs (Moloney et al. (1989) and DeBlock et
al. (1989)).

It would, of course, be optimal to identify a
natural squalene epoxidase gene for each plant species to
be modified in order to ensure complete correspondence of
the DNA used to modify the natural gene and the DNA of
the natural gene itself. If a gene from one plant
species has been cloned, there are methods available to
clone the same gene from other plants. The reliability
of these methods (heterologous hybridization methods)
depends on the similarity of the DNA sequence of the
genes. If the DNA sequences have at least 60% of their
sequence identical, and more preferably at least 70%,
then the methods are usually reliable. Sequence
similarity depends mostly on evolutionary (ancestral)
relationships between plants. Practically, this means
that either of the two genes first cloned by the



WO 97/34003 PCT/CA97/00175

inventors (the Arabidopsis and B. napus genes) may be
used to clone the same gene in any other dicotyledonous
plant (dicot), including, but not limited to, soybean,
tobacco, amaranth, potato, cotton, flax, bean, and pea.
It is also reasonable to assume that the Arabidopsis or
B. napus genes could also be used to clone the same genes
from monocotyledonous plants (monocots), such as wheat,
corn and barley.

The antisense effect occurs when hybridization can
occur between antisense RNA and native RNA under the
conditions prevailing in the cell. This may occur when
the antisense RNA (and corresponding cDNA) contains as
few as 20 nucleotides. More preferably, however, there
should be at least 100 nucleotides in the cDNA to
guarantee the required effect, and of course any larger
portion up to the entire cDNA may be employed. In short,
therefore, for effective antisense technology, the DNA
sequence introduced into the plant genome should
preferably be at least 20 consecutive nucleotides
corresponding to the native squalene epoxidase gene, and
more preferably between 100 and the full DNA sequence of
the gene. The homology of the added sequence may be at
least 60%, and more preferably at least 70%, of the
native plant gene.
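
The length and homology criteria in the preceding paragraph can be summarized as a small decision rule. The Python sketch below is one possible reading of that paragraph and not a definitive design tool. The three category labels and the function name are hypothetical, and the combination of the "preferred" length with the "preferred" homology into a single top category is an assumption made for illustration.

```python
def assess_antisense_fragment(length_nt: int, identity_percent: float) -> str:
    """Classify a candidate antisense fragment against the stated criteria.

    Criteria taken from the text: at least 20 consecutive nucleotides and
    at least 60% homology; more preferably at least 100 nucleotides and
    at least 70% homology. The labels returned are illustrative only.
    """
    if length_nt < 20 or identity_percent < 60.0:
        return "unsuitable"   # below the minimum length or homology
    if length_nt >= 100 and identity_percent >= 70.0:
        return "preferred"    # meets both of the more preferred values
    return "acceptable"       # meets the minimums only
```

For example, a 150-nucleotide fragment with 65% homology meets the minimum criteria but not the more preferred homology value.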


B) Ribozyme Technology

Another method for downregulating gene expression by
affecting mRNA levels is ribozyme technology. Ribozymes
are RNA molecules capable of catalyzing the cleavage of
RNA and other nucleic acids. In nature, Tetrahymena
preribosomal RNA, some viroids, virusoids and satellite
RNAs of plant viruses perform self-cleavage reactions.
The cleavage site for some plant pathogenic RNAs consists
of a consensus structure, called the "hammerhead" motif.
The cleavage occurs within this hammerhead 3' to a GUX
sequence. The region directing the catalysis
can be separated from the region where cleavage
occurs, and the recognition of the target RNA can be
modified by changing the nucleotide sequence of the
regions flanking the cleavage site. As a consequence,
ribozymes can be designed to catalyze cleavage reactions
on targeted sequences of separate RNA substrates. This
provides a means of regulating gene expression, if the
DNA sequence of the gene is known.
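
Locating candidate cleavage sites in a target transcript is a simple pattern search. The Python sketch below scans an RNA (or cDNA) sequence for GUX triplets and reports the position immediately 3' of each one. The specification does not define X; treating X as A, C or U is an assumption made here and is exposed as a parameter. The function name is hypothetical.

```python
import re


def find_gux_sites(sequence: str, allowed_x: str = "ACU") -> list[int]:
    """Return 0-based indices of the nucleotide immediately 3' of each GUX.

    DNA input is accepted and converted to RNA (T -> U). A lookahead is
    used so that overlapping triplets would also be reported.
    Assumption: X may be any base listed in `allowed_x`.
    """
    rna = sequence.upper().replace("T", "U")
    pattern = re.compile(r"(?=GU[%s])" % re.escape(allowed_x))
    return [match.start() + 3 for match in pattern.finditer(rna)]
```

Each reported position marks where the flanking recognition arms of a designed ribozyme would meet. Choosing among the sites would depend on factors the sketch does not model, such as target accessibility.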
In order to genetically engineer the down-regulation
of a particular gene in plants, a vector can be
constructed for transformation that includes transcriptional
units, each of which may include a transcriptional
promoter and a sequence encoding a ribozyme. Examples
have been reported (Steinecke et al. 1992,
Wegener et al. 1994) in which a ribozyme was directed
against neomycin phosphotransferase [...]. Constructs
encoding the ribozyme and the neomycin
phosphotransferase (npt) gene were used to transform
plants. In [...], a reduction in npt activity was
observed relative to plants transformed with
the construct carrying only the npt gene.

Ribozyme technology has also been successful in
other eukaryotes, such as the fruit fly (Zhao and Pick,
1993).

C) Co-suppression or Homology-dependent Gene Silencing

When a gene is introduced into plants, a fraction of the
resulting transgenic plants may show low
levels of expression of both the native gene and the
introduced gene (transgene). This phenomenon has been
called co-suppression or homology-dependent gene
silencing (Stam et al. 1996, Matzke and Matzke 1995).
The mechanism by which co-suppression occurs is very
poorly understood. However, advantage can be taken of
the phenomenon to down-regulate the expression of a gene
of interest. This can be accomplished by transforming a
plant with a DNA construct which contains a strong
transcriptional promoter driving the sense transcription
of a DNA sequence with high similarity to the gene of
interest. For example, when the chalcone synthase gene
was introduced into petunia in an attempt to overproduce
chalcone synthase (which is involved in flower pigment
biosynthesis), some transgenic plants showed pigment
patterns and enzyme levels that indicated the suppression
of chalcone synthase gene expression (Jorgensen 1990).
Investigation of examples such as these has shown that
the effect is often associated with repetition of the
transgene inserts in the plant genome. Co-suppression may
be dependent on the coding region of a gene or on the
promoter and other non-coding regions.

Thus, the down-regulation of squalene epoxidase in
plants may be engineered with the use of cDNA sequences
that are disclosed herein, or with plant genomic
sequences which may include the promoter or promoters of
squalene epoxidase genes.

D) Other variations

Variations on the process of increasing squalene in
plants include the use of different promoter sequences
which may give rise to increased squalene in other
tissues and at various stages of development. For
example, the use of the cauliflower mosaic virus 35S
promoter is likely to have an effect in most plant
tissues. Other seed-specific and tissue-specific
promoters may also be used.

Also, other plant transformation methods may be used,
such as the particle gun technique (Christou 1993).

As well, other vectors, selectable markers,
transcription terminators, etc., may be used (Guerineau
and Mullineaux 1993).

It has already been observed that overexpression of
a fragment of the hamster 3-hydroxy-3-methylglutaryl CoA
reductase (HMGR) gene in plants can elevate squalene
levels in plants (Chappell et al. 1994). This is likely
due to the fact that the level of HMGR limits the flow of
carbon through the mevalonate/sterol pathway that
includes squalene. It would be expected that a
combination of elevated HMGR levels and down-regulated
squalene epoxidase levels would have an effect on raising
squalene levels that would be larger than the effect of
either elevated HMGR alone or down-regulated squalene
epoxidase alone.


Experimental Detail

IDENTIFICATION OF THE SQUALENE EPOXIDASE GENE

The DNA sequence of the squalene epoxidase gene of
yeast was published by Jandrositz et al. (1991). Using
the TBLASTN™ computer search program (Altschul et al.
1990), the predicted yeast squalene epoxidase amino
acid sequence was used to search a database
which included partial cDNA sequences, called "the
Non-Redundant database", maintained by the National Center
for Biotechnology Information (NCBI) in the United States.
This database is a non-redundant nucleotide database made
up of:

pdb       Brookhaven Protein Data Bank, April 1994 Release
genbank   Genbank Release 87.0, February 15, 1995
gbupdate  Genbank® cumulative updates to genbank major release
embl      EMBL Data Library, Release 41.0, December 1994
emblu     EMBL Data Library, cumulative updates to embl major release

(maintained by the National Center for Biotechnology
Information (NCBI), National Library of Medicine,
National Institutes of Health, Bethesda, MD 20894,
U.S.A.).

The database included expressed sequence tags
(ESTs), i.e. partial sequences of more-or-less randomly
chosen cDNA clones. This search identified the
Arabidopsis thaliana cDNA clone 129F12T7 (Genbank
accession no. T44667) as a putative squalene epoxidase
gene. This clone was the seventh highest scoring
sequence in this search and the highest scoring plant
sequence. The P(N) of 1.9 x 10^-5 was considered
borderline significant. The single high-scoring pair
(HSP) of subsequences found was a stretch of 46
nucleotides with 21 positions identical (45%). Searches
with the T44667 sequence revealed that a large portion of
the 46 nucleotide region (29 nucleotides) matches a
sequence motif found in a variety of enzymes that bind
adenine dinucleotides, such as flavin adenine
dinucleotide (FAD, which at least some squalene
epoxidases are known to use as a cofactor; see Wierenga
et al. 1986). So, in fact, the search, done when only
the partial DNA sequence (T44667) was available,
suggested the possibility, but did not confirm, that
T44667 corresponded to a squalene epoxidase gene.

The 129F12T7 clone was obtained and its DNA
sequenced completely by the inventors at the Plant
Biotechnology Institute of the National Research Council
of Canada at Saskatoon, Saskatchewan, Canada. The DNA
sequence of the cDNA insert of p129F12T7 is shown in the
Sequence Listing (see later) as SEQ ID NO:1. After the
full sequence of the insert of p129F12T7 was obtained,
the Non-Redundant Protein Database (NCBI) was searched
using the BLAST™ software (Altschul et al. 1990) (NCBI)
based on the predicted amino acid sequence. The amino
acid sequence corresponding to the open reading frame of
SEQ ID NO:1 is shown in the Sequence Listing as SEQ ID
NO:2. The Arabidopsis sequence gave the highest scoring
matches with squalene epoxidase sequences, including that
of rat (P(N) = 5 x 10^[...]) and yeast (P(N) = 9.2 x
10^[...]). No sequences which had been reliably identified
had P(N) values less than 10^[...]. These numbers indicate
that the product of the Arabidopsis gene is, in all
probability, squalene epoxidase.

The 129F12T7 clone was used to probe a B. napus cDNA
library, obtained from Dr. Edward Tsang of the Plant
Biotechnology Institute. Two independent clones, pDR111
and pDR411, were isolated and sequenced. The Sequence
Listing shows the DNA sequences of the cDNA inserts of
pDR111 [SEQ ID NO:3] and pDR411 [SEQ ID NO:5] and the
amino acid sequences corresponding to the coding regions
of SEQ ID NO:3 [SEQ ID NO:4] and SEQ ID NO:5 [SEQ ID
NO:11]. pDR111 and pDR411 have similar (but not identical)
DNA sequences which are also similar to the 129F12T7
sequence. Plasmids p129F12T7, pDR111 and pDR411 were
deposited at the American Type Culture Collection (ATCC),
12301 Parklawn Drive, Rockville, Maryland 20852-1776
USA under the terms of the Budapest Treaty on January 9,
1997 and were accepted. The deposit numbers are,
respectively, ATCC 97847, ATCC 97846 and ATCC 97845. A
single deposit receipt and statement of viability was
issued for all three deposits on January 17, 1997.

Figure 1 of the accompanying drawings shows an
alignment of the amino acid sequences for the 129F12T7
clone [SEQ ID NO:2], the pDR111 clone [SEQ ID NO:4] and
the pDR411 clone [SEQ ID NO:11], along with the squalene
epoxidase amino acid sequences for mouse [SEQ ID NO:6],
rat [SEQ ID NO:7] and yeast [SEQ ID NO:8]. The
plant sequences show blocks of high similarity to the
non-plant sequences, including the region thought to
correspond to an adenine dinucleotide-binding site
(residues 45-88 of the Arabidopsis sequence; Wierenga et
al. 1986; Sakakibara et al. 1995), as well as in the
C-terminal half of the sequence. The amino acid sequence
similarities based on this alignment are shown in Table 1
below.

Table 1

Amino acid sequence similarities calculated by MEGALIGN™
software for the sequence alignment of Figure 1.


Analysis of the pDR411 sequence suggests it has an
intron in the 3'-end of its amino acid coding region
which is, of course, unusual in cDNA. If nucleotides
1473-1629 (inclusive) are removed from the sequence and
the cDNA translated, the C-terminus is more similar to
the pDR111 and p129F12T7 amino acid sequences [SEQ ID
NO:4 and SEQ ID NO:2]. Also, there are sequence patterns
in this region that are common to other plant introns (5'
and 3' splice consensus sequences and high AT content
(Goodall and Filipowicz, 1991)). This may mean that the
pDR411 clone represents an intermediate or precursor RNA,
rather than the final messenger RNA (mRNA). There can
therefore be less certainty in predicting the full amino
acid sequence corresponding to pDR411, although this
predicted sequence is shown in Fig. 1 [SEQ ID NO:11].
However, the possible presence of a small intron in the
3'-end of pDR411 does not cause a problem for its use in
antisense techniques.
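The operation described above (delete an inclusive nucleotide range, then translate the remainder) can be sketched as follows. This is a minimal illustration on a made-up toy sequence, not on the pDR411 sequence itself; the helper names `remove_span`, `translate` and `at_content` are hypothetical, and the standard genetic code is assumed.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def remove_span(seq: str, start: int, end: int) -> str:
    """Delete positions start..end (1-based, inclusive) from seq."""
    return seq[:start - 1] + seq[end:]


def translate(seq: str, cds_start: int) -> str:
    """Translate from the 1-based position cds_start up to the first stop codon."""
    protein = []
    for i in range(cds_start - 1, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def at_content(seq: str) -> float:
    """Fraction of A and T bases, a rough indicator of intron-like composition."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


# Toy example: 2-base leader, two codons, a 10-base AT-rich "intron"
# (GT...AG), then one codon and a stop codon.
toy = "CC" + "ATGGCT" + "GTAAATTTAG" + "TGGTAA"
spliced = remove_span(toy, 9, 18)
protein = translate(spliced, 3)
```

With the intron-like span removed the reading frame is restored, which is the effect the comparison of C-termini above relies on.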

Employing the plant squalene epoxidase sequences,
transgenic plants can be generated which accumulate
squalene in their seeds. This can be done by established
genetic transformation methods using DNA constructs that
include the napin or other seed-specific promoters
(Kridl, 1988; Anonymous, 1995) and fragments of plant
squalene epoxidase genes arranged in the antisense
orientation. Downregulation of the squalene epoxidase
gene in seeds by antisense technology (Inouye, 1990;
Bourque, 1995) will prevent the conversion of squalene to
squalene epoxide and result in squalene accumulation.
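Placing a fragment in the antisense orientation means that the strand read from the promoter is the reverse complement of the coding strand. A minimal sketch follows; the function name `reverse_complement` is hypothetical, and the example fragment is the first four codons (ATG ACT TAC GCG) of the coding region given in SEQ ID NO:1.

```python
# Map each base to its Watson-Crick partner, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


sense_fragment = "ATGACTTACGCG"
antisense_fragment = reverse_complement(sense_fragment)
```

Applying the function twice returns the original fragment, which is a convenient sanity check when preparing construct maps.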

ISOLATION OF SQUALENE EPOXIDASE GENE IN B. NAPUS

The 129F12T7 clone obtained as described above was
used to probe for the homologous gene in B. napus as
follows.

Unless otherwise noted all molecular biology methods
were performed as described in Ausubel et al. (1994).

The Arabidopsis 129F12T7 DNA Probe

The plasmid p129F12T7 was digested with the
restriction enzymes Sal I and Not I. The resulting DNA
fragments were separated by agarose gel electrophoresis.
The 1.8 kb Sal I/Not I DNA fragment corresponding to the
Arabidopsis squalene epoxidase cDNA was purified from a
gel band. A radiolabeled DNA probe was prepared by the
random priming method and [alpha-32P]-dCTP (deoxycytidine
triphosphate).

Library Screening

The probe produced as above was used to screen a B.
napus cDNA library, kindly provided by Dr. Edward Tsang
of the Plant Biotechnology Institute (Saskatoon,
Saskatchewan, Canada). To construct the library, B.
napus seedlings (cv. Westar) were grown (on half strength
Murashige and Skoog agar (1%) medium supplemented with 1%
sucrose) in the dark at 22°C for two weeks after
germination and exposed to light for 24 hours. PolyA+
RNA was extracted from the seedlings and first strand
cDNA synthesis was primed with an oligo dT/Not I
adapter/primer. Sal I adapters were ligated after second
strand cDNA synthesis and a library was constructed in
Not I/Sal I arms of the Lambda ZipLox vector (Life
Technologies).

The library was plated using standard methods and
the Y1090 strain of E. coli. Approximately 25,000
plaques from the library were plated, lifted onto
Hybond®-c nylon membranes (Amersham) and hybridized with
the above probe according to the manufacturer's
instructions. After two rounds of plaque purification,
two independent clones, pDR111 and pDR411, were isolated
by in vivo excision.

The p129F12T7, pDR111 and pDR411 clones were
sequenced using the PRISM® DyeDeoxy Terminator Cycle
Sequencing System (Perkin Elmer/Applied Biosystems) and a
Model 373 DNA Sequencer (Applied Biosystems). DNA
sequences were assembled and analyzed using the
Lasergene® suite of software (DNASTAR, Inc.) and BLAST*
and related software of the NCBI.

CONSTRUCTION OF VECTORS FOR PLANT TRANSFORMATION

Figs. 2, 3 and 4 show three vectors constructed for
plant transformation, namely pSE129A, pSE111A and
pSE411A. In these drawings, the following abbreviations
are used:

nosT     3'-terminus of the nopaline synthase gene
SE129    Sal I/Not I insert of p129F12T7
SE111    Sal I/Xba I fragment of the insert of pDR111
SE411    Sal I/Not I insert of pDR411
Napin P  napin gene promoter (Josefsson 1986).

All other elements are described by Guerineau and
Mullineaux (1993), Thomas et al. (1992) and Bevan (1984).

These plasmids were constructed as follows.

pDH1

The plasmid pE35SNT was obtained from Raju Datla
(Plant Biotechnology Institute, Saskatoon, Saskatchewan,
Canada). It contains a double 35S promoter and nopaline
synthase (Nos) terminator (Datla, 1992) in pUC19. It was
digested with Hind III and Xba I to remove the double 35S
promoter. The napin promoter (Josefsson et al. 1987) was
isolated from pNap (obtained from Ravi Jain, Plant
Biotechnology Institute, Saskatoon, Saskatchewan, Canada)
by Hind III and Xba I digestion. The plasmid pDH1 was
produced by ligation of the large pE35SNT/Hind III/Xba I
fragment and the Hind III/Xba I napin promoter fragment.
Thus, pDH1 contained the napin promoter and the Nos
pSE411A

The pDR411 plasmid was digested with [...] I and Xba I.
The fragment containing the squalene epoxidase [...] was
ligated to the large fragment of the [...] I-digested
pDH129A vector ([...] Arabidopsis cDNA sequence) [...] of
the napin promoter and upstream of the Nos terminator.
[...] was digested with EcoR I and partially digested with
Hind III [...] the fragment containing the napin promoter,
the squalene epoxidase cDNA and the Nos terminator [...]
Hind III- and EcoR I-digested pRD400 ([...] 1992) to give
pSE411A.

The final vectors pSE129A [...]
The pSE129A construct was tested in A. thaliana by
in planta transformation techniques.

Wild type (WT) A. thaliana plants of ecotype
Columbia were grown in soil. In planta transformation
was performed by vacuum infiltration (Bechtold et al.
1993) with overnight bacterial suspension of A.
tumefaciens strain GV3101 bearing helper nopaline plasmid
pMP90 (disarmed Ti plasmid with intact vir region acting
in trans, gentamycin and kanamycin selection markers;
Koncz and Schell (1986)) and binary vector pSE129A.

After infiltration, plants were grown to set seeds
(T1 generation). Dry seeds (T1 generation of seeds) were
harvested in bulk and screened on selective medium with
50 mg/L kanamycin. After two to three weeks on selective
medium, surviving seedlings were transferred to soil.
Mature seeds from these seedlings (T2 seeds) were used
for squalene analysis. Mature seeds from untransformed
wild type (WT) Columbia plants and pRD400 transgenic
plants (binary vector pRD400, containing only kanamycin
selection marker; Datla et al. 1992) were used as
controls in analyses of seed lipids.

Seed Analysis

Seeds were analyzed for squalene levels as follows.

In all steps, care was taken to avoid contamination
from external sources, particularly human skin. 5-10 mg of
Arabidopsis seeds were weighed and rinsed with hexane to
remove any external contamination. 1 ml of 7.5% KOH (in
95% methanol) was added to each sample and 250 ng of
squalane were added as internal standard. (Squalane is
the hydrogenated form of squalene.) Seeds were
homogenized with a Polytron* (Model PRO200, PRO
Scientific) at maximum speed for 40 seconds. The head of
the Polytron was washed with 1 ml of 7.5% KOH (in 95%
methanol) and the wash was pooled with the homogenate.
The mixture was incubated at 80°C for 1 hr, then cooled
to room temperature. The mixture was centrifuged at 3000
g for 5 min, and the supernatant was transferred to a
fresh tube. One ml of H2O and 1.5 ml of hexane were
added to the supernatant and, after vortexing, the
mixture was centrifuged at 3000 g for 5 minutes. The
hexane (top) layer was transferred to another test tube.
The aqueous phase was re-extracted with 1.5 ml hexane and
the hexane fractions were pooled. The hexane fraction
was extracted with 1 ml of water/methanol/KOH (50:50:2)
and evaporated under nitrogen. The residue was dissolved
in 50 µl of hexane and transferred to an autosampler
vial. Gas-liquid chromatography was performed with a DB5
column (J & W Scientific, USA) using the following
parameters:

Column Temperature:    0-1 min     180°C
                       1-16 min    180-280°C (linear ramp)
                       16-30 min   280°C
Injector Temperature:  275°C
Detector Temperature:  300°C.
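The column temperature program above, and the way an internal standard is typically used, can be sketched as follows. The function names are hypothetical. The quantification helper assumes that squalene and squalane give equal detector response per unit mass, which is an assumption of this sketch and is not stated in the method; the peak areas passed to it would come from the chromatogram.

```python
# Linear ramp from 180 to 280 deg C between minute 1 and minute 16.
RAMP_RATE = (280.0 - 180.0) / (16 - 1)  # deg C per minute


def column_temperature(t_min: float) -> float:
    """Column temperature (deg C) at t_min minutes into the 30-minute run."""
    if not 0 <= t_min <= 30:
        raise ValueError("the program runs from 0 to 30 minutes")
    if t_min <= 1:
        return 180.0
    if t_min <= 16:
        return 180.0 + (t_min - 1) * RAMP_RATE
    return 280.0


def squalene_ng_per_mg(area_squalene: float, area_squalane: float,
                       seed_mg: float, standard_ng: float = 250.0) -> float:
    """Estimate squalene (ng per mg seed) from the squalane internal standard.

    Assumes equal detector response for squalene and squalane (hypothetical).
    """
    if area_squalane <= 0 or seed_mg <= 0:
        raise ValueError("standard peak area and seed mass must be positive")
    return area_squalene / area_squalane * standard_ng / seed_mg
```

The ramp works out to roughly 6.7°C per minute, so the column is at about 230°C midway through the ramp (8.5 minutes into the run).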



Transgenic Results

Seeds from 9 Arabidopsis lines transformed with
pRD400 and 55 lines transformed with pSE129A were
analyzed for squalene content. Table 2 below shows the
results for all of the pRD400 transgenic lines and 4
pSE129A lines.

The mean and standard deviation of the 9 pRD400 lines are
4.6 and 0.7, respectively.

(iii) NUMBER OF SEQUENCES: 11

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1756 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(B) STRAIN: Columbia

(D) DEVELOPMENTAL STAGE: 3 different stages

(F) TISSUE TYPE: 4 different tissues

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Lambda-PRL2

(B) CLONE: 129F12T7

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 15..1565

(D) OTHER INFORMATION: /codon_start= 15
/function= "converts squalene to 2,3-oxidosqualene"
/EC_number= 1.14.99.7
/product= "squalene epoxidase"
/standard_name= "squalene monooxygenase (2,3-epoxidizing)"

(ix) FEATURE:

(A) NAME/KEY: 3'UTR

(B) LOCATION: 1566..1756

(ix) FEATURE:

(A) NAME/KEY: polyA_site

(B) LOCATION: 1756

(ix) FEATURE:

(A) NAME/KEY: 5'UTR

(B) LOCATION: 1..14



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CCACGCGTCC GGCA ATG ACT TAC GCG TGG TTA TGG ACG CTT CTC GCC TTT 50
Met Thr Tyr Ala Trp Leu Trp Thr Leu Leu Ala Phe
1

GTT CTG ACA TGG ATG GTT TTT CAC CTC ATC AAG ATG AAG AAG GCG GCA 98
Val Leu Thr Trp Met Val Phe His Leu Ile Lys Met Lys Lys Ala Ala

ACC GGA GAT TTA GAG GCC GAG GCA GAA GCA AGA AGA GAT GGT GCA ACG 146
Thr Gly Asp Leu Glu Ala Glu Ala Glu Ala Arg Arg Asp Gly Ala Thr

GAT GTC ATC ATT GTT GGG GCG GGT GTT GCA GGC GCT TCT CTT GCT TAT 194
Asp Val Ile Ile Val Gly Ala Gly Val Ala Gly Ala Ser Leu Ala Tyr
45 50 55 60

GCT TTA GCT AAG GAT GGA CGA CGA GTA CAT GTG ATA GAG AGG GAC TTA 242
Ala Leu Ala Lys Asp Gly Arg Arg Val His Val Ile Glu Arg Asp Leu
65 70 75
105

GC * CAA GAA GCG AAG TCC Tre m 

- - «. ». w. z r 

*"* »> u. «, 

120 

ACA TTG CCT TTT CCA OAT GAC AAG AGT TTT 

»~ - -o Phe Pro Aap ^ £ £ « «* CAT GAG CCA CTA 

" ^ P " Hi ' — v al Gly 

135 

14 0 

AGA CTC TTA rr~r x»« 
- - - *. * «, zzz zz r ~- * — «

145 Leu Ar 9 Gin Lys Al a 

150 

TO OT «c „ T OT 

- - - - ~ 0l „ ™ r ~ £ r rr - m ™ 

160 U Gly Thr Val Lya ser Le U 

165 

170 



386 



434 



482 



S30 



AGC GCA 



570 



Ser Al 



a 



ATT GAA QAA GAA QGA GTG CTC Ail 

»• «u «« «« « y Val " ^ ^ *<* AAA AAT 

VI Lys Gly val Thr ^ ^ ^ 

180 

185 

GGC GAA GAA ATa »^ 

ATA ACG GCC TTT OCX nnt* 

«- «« XI. T* r Us Phe ^ « ^ «C GTA TGC OAT GGT 

1M A-p Gly 

200 

TCT ™ ™ «C CTT CGT CGG TCA „ r 

S - A.„ UuAr3 ^ " C *AT ACT GAG GAA GTC 

«. " Val A8P As " oi« «u v.1 

2X5 

~' »' °* T„ „Z Tht ~ « " «« «T ,„ 

225 * er Ar 3 L«u Glu Aap 

230 F 

23S 



CCC CAT ACT CTA CAT TTG ATA TTT TCT AAA CCT TTG GTT TGT GTT ATA 770
Pro His Ser Leu His Leu Ile Phe Ser Lys Pro Leu Val Cys Val Ile
240 245 250

TAT CAA ATA ACC AGT GAT GAA GTT CGT TGT GTT GCC GAA GTT CCC GCT 818
Tyr Gln Ile Thr Ser Asp Glu Val Arg Cys Val Ala Glu Val Pro Ala
255 260 265

GAT AGT ATT CCT TCT ATA TCG AAT GGT GAA ATG TCT ACC TTC CTC AAG 866
Asp Ser Ile Pro Ser Ile Ser Asn Gly Glu Met Ser Thr Phe Leu Lys
270 275 280

AAA TCA ATG GCT CCT CAG ATA CCT GAA ACT GGA AAT CTT CGG GAG ATA 914
Lys Ser Met Ala Pro Gln Ile Pro Glu Thr Gly Asn Leu Arg Glu Ile
285 290 295 300

TTT TTG AAA GGC ATA GAG GAA GGA TTA CCA GAG ATA AAA TCA ACA GCG 962
Phe Leu Lys Gly Ile Glu Glu Gly Leu Pro Glu Ile Lys Ser Thr Ala
305 310 315

ACG AAA AGT ATG TCA TCG AGA TTG TGT GAT AAA AGA GGA GTG ATT GTG 1010
Thr Lys Ser Met Ser Ser Arg Leu Cys Asp Lys Arg Gly Val Ile Val
320 325 330

TTG GGA GAT GCA TTC AAT ATG CGT CAT CCT ATA ATC GCG TCA GGA ATG 1058
Leu Gly Asp Ala Phe Asn Met Arg His Pro Ile Ile Ala Ser Gly Met
335 340 345

ATG GTT GCA CTC TCG GAC ATT TGC ATT CTA CCC AAT CTT CTC AAA CCA 1106
Met Val Ala Leu Ser Asp Ile Cys Ile Leu Arg Asn Leu Leu Lys Pro
350 355 360

TTG CCT AAC CTC AGC AAT ACT AAG AAA GTC TCT GAT CTT GTC AAG TCC 1154
Leu Pro Asn Leu Ser Asn Thr Lys Lys Val Ser Asp Leu Val Lys Ser
365 370 375 380

TTT TAC ATC ATC CGC AAG CCA ATG TCA GCG ACC GTG AAC ACG CTC GCG 1202
Phe Tyr Ile Ile Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Ala
385 390 395

AGT ATC TTT TCA CAA GTG CTT GTT GCT ACA ACA GAC GAA GCA AGA GAG 1250
Ser Ile Phe Ser Gln Val Leu Val Ala Thr Thr Asp Glu Ala Arg Glu
400 405 410

GGA ATG CGA CAA GGC TGC TTC AAT TAC CTA GCT CGT GGA GAT TTT AAA 1298
Gly Met Arg Gln Gly Cys Phe Asn Tyr Leu Ala Arg Gly Asp Phe Lys
415 420 425

ACA AGG GGA TTG ATG ACT ATT CTC GGA GGC ATG AAC CCT CAC CCT CTT 1346
Thr Arg Gly Leu Met Thr Ile Leu Gly Gly Met Asn Pro His Pro Leu
430 435 440

ACT CTA GTC CTT CAT CTT GTA GCC ATC ACC CTT ACG TCC ATG GGC CAC 1394
Thr Leu Val Leu His Leu Val Ala Ile Thr Leu Thr Ser Met Gly His

TTG CTC TCT CCG TTT CCT TCG CCT CGT CGC TTT TGG CAT AGC CTC AGA 1442
Leu Leu Ser Pro Phe Pro Ser Pro Arg Arg Phe Trp His Ser Leu Arg
465 470 475

ATT CTT GCC TGG GCT TTG CAA ATG TTG GGT GCA CAT TTA GTG GAT GAA 1490
Ile Leu Ala Trp Ala Leu Gln Met Leu Gly Ala His Leu Val Asp Glu
480 485 490

GGA TTC AAG GAA ATG TTG ATT CCA ACA AAC GCA GCT GCT TAT CGA AGG 1538
Gly Phe Lys Glu Met Leu Ile Pro Thr Asn Ala Ala Ala Tyr Arg Arg
495 500 505

AAC TAT ATC GCC ACA ACC ACT GTT TGA TCAATCCATA ACACGAAGAC 1585
Asn Tyr Ile Ala Thr Thr Thr Val
510 515

TGTTTTATTC GGAGATUAAA AATAACAACT CAAACAGTTA ACTTTCTACA ACCAAATAAA 1645
TAATTGTGTG TATATGAAGT TGAGCCTATG GTTAAGCTCT ACTGAATTGT GTTGAAAACA 1705
AACATGGATA TGTTATATGC TAATTTGTTA TATTCTATTT ATTGATTCTT G 1756

(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 516 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Tyr Ala Trp Leu Trp Thr Leu Leu Ala Phe Val Leu Thr Trp
1 5 10 15

Met Val Phe His Leu Ile Lys Met Lys Lys Ala Ala Thr Gly Asp Leu
20 25 30

Glu Ala Glu Ala Glu Ala Arg Arg Asp Gly Ala Thr Asp Val Ile Ile
35 40 45

Val Gly Ala Gly Val Ala Gly Ala Ser Leu Ala Tyr Ala Leu Ala Lys
50 55 60

Asp Gly Arg Arg Val His Val Ile Glu Arg Asp Leu Lys Glu Pro Gln
65 70 75 80

Arg Phe Met Gly Glu Leu Met Gln Ala Gly Gly Arg Phe Met Leu Ala
85 90 95

Gln Leu Gly Leu Glu Asp Cys Leu Glu Asp Ile Asp Ala Gln Glu Ala
100 105 110

Lys Ser Leu Ala Ile Tyr Lys Asp Gly Lys His Ala Thr Leu Pro Phe
115 120 125

Pro Asp Asp Lys Ser Phe Pro His Glu Pro Val Gly Arg Leu Leu Arg
130 135 140

Asn Gly Arg Leu Val Gln Arg Leu Arg Gln Lys Ala Ala Ser Leu Ser
145 150 155 160

Asn Val Gln Leu Glu Glu Gly Thr Val Lys Ser Leu Ile Glu Glu Glu
165 170 175

Gly Val Val Lys Gly Val Thr Tyr Lys Asn Ser Ala Gly Glu Glu Ile
180 185 190

Thr Ala Phe Ala Pro Leu Thr Val Val Cys Asp Gly Cys Tyr Ser Asn
195 200 205

Leu Arg Arg Ser Leu Val Asp Asn Thr Glu Glu Val Leu Ser Tyr Met
210 215 220

Val Gly Tyr Val Thr Lys Asn Ser Arg Leu Glu Asp Pro His Ser Leu
225 230 235 240

His Leu Ile Phe Ser Lys Pro Leu Val Cys Val Ile Tyr Gln Ile Thr
245 250 255

Ser Asp Glu Val Arg Cys Val Ala Glu Val Pro Ala Asp Ser Ile Pro
260 265 270

Ser Ile Ser Asn Gly Glu Met Ser Thr Phe Leu Lys Lys Ser Met Ala
275 280 285

Pro Gln Ile Pro Glu Thr Gly Asn Leu Arg Glu Ile Phe Leu Lys Gly
290 295 300

Ile Glu Glu Gly Leu Pro Glu Ile Lys Ser Thr Ala Thr Lys Ser Met
305 310 315 320

Ser Ser Arg Leu Cys Asp Lys Arg Gly Val Ile Val Leu Gly Asp Ala
325 330 335

Phe Asn Met Arg His Pro Ile Ile Ala Ser Gly Met Met Val Ala Leu
340 345 350

Ser Asp Ile Cys Ile Leu Arg Asn Leu Leu Lys Pro Leu Pro Asn Leu
355 360 365

Ser Asn Thr Lys Lys Val Ser Asp Leu Val Lys Ser Phe Tyr Ile Ile
370 375 380

Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Ala Ser Ile Phe Ser
385 390 395 400

Gln Val Leu Val Ala Thr Thr Asp Glu Ala Arg Glu Gly Met Arg Gln
405 410 415

Gly Cys Phe Asn Tyr Leu Ala Arg Gly Asp Phe Lys Thr Arg Gly Leu
420 425 430

Met Thr Ile Leu Gly Gly Met Asn Pro His Pro Leu Thr Leu Val Leu
435 440 445

His Leu Val Ala Ile Thr Leu Thr Ser Met Gly His Leu Leu Ser Pro
450 455 460

Phe Pro Ser Pro Arg Arg Phe Trp His Ser Leu Arg Ile Leu Ala Trp
465 470 475 480

Ala Leu Gln Met Leu Gly Ala His Leu Val Asp Glu Gly Phe Lys Glu
485 490 495

Met Leu Ile Pro Thr Asn Ala Ala Ala Tyr Arg Arg Asn Tyr Ile Ala
500 505 510

Thr Thr Thr Val
515
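The feature table of a nucleotide entry and the stated length of the corresponding protein entry can be cross-checked with simple arithmetic. The following is a minimal sketch; the function name `protein_length_from_cds` is hypothetical, and it assumes the CDS location includes the stop codon, as is the case for a coding region that ends immediately before a 3'UTR starting at position 1566.

```python
def protein_length_from_cds(start: int, end: int) -> int:
    """Number of amino acids encoded by a CDS spanning start..end (1-based, inclusive).

    Assumes the span includes the terminal stop codon (an assumption of this sketch).
    """
    n_bases = end - start + 1
    if n_bases % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_bases // 3 - 1  # subtract the stop codon


# SEQ ID NO:1: 5'UTR 1..14 and 3'UTR from 1566, so the CDS spans 15..1565.
arabidopsis_residues = protein_length_from_cds(15, 1565)

# SEQ ID NO:3: CDS 19..1575.
napus_residues = protein_length_from_cds(19, 1575)
```

The results agree with the lengths stated for SEQ ID NO:2 (516 amino acids) and SEQ ID NO:4 (518 amino acids).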

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1748 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Brassica napus

(B) STRAIN: Westar

(D) DEVELOPMENTAL STAGE: 14 day greening-etiolated
(F) TISSUE TYPE: hypocotyls

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Tsang

(B) CLONE: pDR111

(ix) FEATURE:

(A) NAME/KEY: 5'UTR

(B) LOCATION: 1..18

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 19..1575

(ix) FEATURE:

(A) NAME/KEY: 3'UTR

(B) LOCATION: 1576..1748

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCACGCGTCC GAAAAGAT ATG GAT ATG GCT TTT GTG GAA GTT TGT TTA CGG 51
Met Asp Met Ala Phe Val Glu Val Cys Leu Arg
520

ATG CTA CTT GTC TTC GTA CTG TCT TGG ACG ATA TTT CAC GTC AAC AAC 99
Met Leu Leu Val Phe Val Leu Ser Trp Thr Ile Phe His Val Asn Asn
530 535 540

AGG AAG AAG AAG AAG GCG ACG AAG TTG GCG GAT CTG GCT ACT GAG GAG 147
Arg Lys Lys Lys Lys Ala Thr Lys Leu Ala Asp Leu Ala Thr Glu Glu
550 555 560

AGA AAA GAA GGT GGC CCT GAC GTC ATA ATA GTC GGA GCT GGA GTG GGC 195
Arg Lys Glu Gly Gly Pro Asp Val Ile Ile Val Gly Ala Gly Val Gly
565 570 575

GGC TCA GCT CTC GCC TAT GCT CTT GCT AAG GAC GGG CGT CGA GTA CAT 243
Gly Ser Ala Leu Ala Tyr Ala Leu Ala Lys Asp Gly Arg Arg Val His
580 585 590

GTG ATA GAA AGA GAC ATG AGA GAG CCA GTG AGA ATG ATG GGT GAG TTC 291
Val Ile Glu Arg Asp Met Arg Glu Pro Val Arg Met Met Gly Glu Phe
595 600 605

ATG CAG CCA GGA GGA CGG CTC ATG CTT TCT AAG CTC GGT CTT CAA GAT 339
Met Gln Pro Gly Gly Arg Leu Met Leu Ser Lys Leu Gly Leu Gln Asp
610 615 620

TGT TTA GAG GAA ATA GAC GCA CAG AAA TCC ACC GGC ATA AGA CTT TTT 387
Cys Leu Glu Glu Ile Asp Ala Gln Lys Ser Thr Gly Ile Arg Leu Phe
625 630 635 640

AAG GAC GGA AAA GAA ACT GTC GCA TGT TTT CCG GTG GAC ACC AAC TTT 435
Lys Asp Gly Lys Glu Thr Val Ala Cys Phe Pro Val Asp Thr Asn Phe
645 650 655

CCT TAT GAA CCA TCT GGT CGA TTT TTT CAC AAT GGC CGT TTT GTC CAG 483
Pro Tyr Glu Pro Ser Gly Arg Phe Phe His Asn Gly Arg Phe Val Gln
660 665 670

AGA CTG CGC CAA AAG GCC TCT TCT CTT CCC AAT GTG CGG CTG GAA GAA 531
Arg Leu Arg Gln Lys Ala Ser Ser Leu Pro Asn Val Arg Leu Glu Glu
675 680 685

GGG ACC GTC CGA TCT TTG ATA GAA GAA AAA GGA GTG GTC AAA GGA GTG 579
Gly Thr Val Arg Ser Leu Ile Glu Glu Lys Gly Val Val Lys Gly Val
690 695 700

ACA TAC AAG AAC ACT TCA GGG GAA GAA ACC ACA TCA TTT GCA CCT CTC 627
Thr Tyr Lys Asn Ser Ser Gly Glu Glu Thr Thr Ser Phe Ala Pro Leu
705 710 715 720

ACT GTC GTA TGC GAT CGT TCC CAC TCG AAC CTT CGT CGC TCT CTA AAT 675
Thr Val Val Cys Asp Gly Cys His Ser Asn Leu Arg Arg Ser Leu Asn
725 730 735

GAC AAC AAT GCG GAG CTT ACG GCG TAC GAG ATT GGT TAC ATC TCG AGG 723
Asp Asn Asn Ala Glu Val Thr Ala Tyr Glu Ile Gly Tyr Ile Ser Arg
740 745 750

AAT TGT CGC CTT GAA CAG CCC GAC AAG TTA CAC TTG ATA ATC GCT AAA 771
Asn Cys Arg Leu Glu Gln Pro Asp Lys Leu His Leu Ile Met Ala Lys
755 760 765

CCG TCT TTC GCC ATG TTG TAT CAA GTC AGC AGC ACC GAC GTT CGT TGT 819
Pro Ser Phe Ala Met Leu Tyr Gln Val Ser Ser Thr Asp Val Arg Cys
770 775 780

AAT TTT GAG CTT CTC TCC AAA AAT CTT CCT TCT GTT TCA AAT GGT GAA 867
Asn Phe Glu Leu Leu Ser Lys Asn Leu Pro Ser Val Ser Asn Gly Glu
785 790 795 800

ATG ACG TCC TTC GTG AGG AAC TCT ATT GCT CCC CAG GTA CCT CTA AAA 915
Met Thr Ser Phe Val Arg Asn Ser Ile Ala Pro Gln Val Pro Leu Lys
805 810 815

CTC CGC AAA ACA TTT TTG AAA GGG CTC GAT GAG GGA TCA CAT ATA AAA 963
Leu Arg Lys Thr Phe Leu Lys Gly Leu Asp Glu Gly Ser His Ile Lys
820 825 830

ATT ACA CAA GCA AAG CGC ATC CCA GCT ACT TTG AGC AGA AAA AAG GGA 1011
Ile Thr Gln Ala Lys Arg Ile Pro Ala Thr Leu Ser Arg Lys Lys Gly
835 840 845

GTG ATT GTG TTG GGA GAT GCA TTC AAC ATG CGT CAT CCC GTA ATC GCG 1059
Val Ile Val Leu Gly Asp Ala Phe Asn Met Arg His Pro Val Ile Ala
850 855 860

TCG GGG ATG ATG GTT TTA TTG TCT GAC ATT CTC ATT CTA AGC CGT CTT 1107
Ser Gly Met Met Val Leu Leu Ser Asp Ile Leu Ile Leu Ser Arg Leu
865 870 875 880

CTC AAG CCT TTG GGC AAC CTC GGT GAT GAA AAC AAA GTC TCA GAA GTT 1155
Leu Lys Pro Leu Gly Asn Leu Gly Asp Glu Asn Lys Val Ser Glu Val

ATG AAG TCC TTC TAT GCT CTA CGC AAG CCA ATG TCA GCA ACA GTA AAC 1203
Met Lys Ser Phe Tyr Ala Leu Arg Lys Pro Met Ser Ala Thr Val Asn
900 905 910

ACA CTA GGG AAT TCA TTT TGG CAA GTG CTA ATT GCT TCA ACG GAC GAA 1251
Thr Leu Gly Asn Ser Phe Trp Gln Val Leu Ile Ala Ser Thr Asp Glu

GCA AAA GAG GCC ATG CGA CAA GGT TGC TTT GAT TAC CTC TCT ACT GGT 1299
Ala Lys Glu Ala Met Arg Gln Gly Cys Phe Asp Tyr Leu Ser Ser Gly
930 935 940

GGG TTT CGC ACG TCA GGC TTG ATG GCT CTG ATT GGT GGC ATG AAC CCT 1347
Gly Phe Arg Thr Ser Gly Leu Met Ala Leu Ile Gly Gly Met Asn Pro
945 950 955 960

AGG CCA CTT TCT CTC TTC TAT CAT CTA TTC GTT ATT TCT TTA TCC TCC 1395
Arg Pro Leu Ser Leu Phe Tyr His Leu Phe Val Ile Ser Leu Ser Ser
965 970 975

ATT GGC CAA CTG CTC TCT CCA TTC CCC ACT CCT CTT CGT GTT TGG CAT 1443
Ile Gly Gln Leu Leu Ser Pro Phe Pro Thr Pro Leu Arg Val Trp His
980 985 990

AGC CTC AGA CTT CTT GAT TTG TCT TTG AAA ATG TTG GTT CCT CAT CTC 1491
Ser Leu Arg Leu Leu Asp Leu Ser Leu Lys Met Leu Val Pro His Leu
995 1000 1005

AAG GCC GAA GGA ATA GGT CAA ATG TTG TCT CCA ACA AAT GCA GCG GCG 1539
Lys Ala Glu Gly Ile Gly Gln Met Leu Ser Pro Thr Asn Ala Ala Ala
1010

TAT CGC AAA AGC TAT ATG GCT GCA ACC GTT GTC TAG ACATTGATGA 1585
Tyr Arg Lys Ser Tyr Met Ala Ala Thr Val Val
1030 1035

AATATAGATG GTGCACAAAT CTTTGTGATT GTGGATTTGT GAAAATAGTA TTGCAATATG 1645
TTACTGAAGA AACTTTTCCT TATCCACTTA TAAGTGGAAA TAGGAAGAAT GTGTATATAT 1705
GTAAGGGGTG ACAATTATTT TGAAATAAAA TTAAGAAAAT AAC 1748

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 518 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Asp Met Ala Phe Val Glu Val Cys Leu Arg Met Leu Leu Val Phe
1 5 10 15

Val Leu Ser Trp Thr Ile Phe His Val Asn Asn Arg Lys Lys Lys Lys
20 25 30

Ala Thr Lys Leu Ala Asp Leu Ala Thr Glu Glu Arg Lys Glu Gly Gly
35 40 45

Pro Asp Val Ile Ile Val Gly Ala Gly Val Gly Gly Ser Ala Leu Ala
50 55 60

Tyr Ala Leu Ala Lys Asp Gly Arg Arg Val His Val Ile Glu Arg Asp
65 70 75 80

Met Arg Glu Pro Val Arg Met Met Gly Glu Phe Met Gln Pro Gly Gly
85 90 95

Arg Leu Met Leu Ser Lys Leu Gly Leu Gln Asp Cys Leu Glu Glu Ile
100 105 110

Asp Ala Gln Lys Ser Thr Gly Ile Arg Leu Phe Lys Asp Gly Lys Glu
115 120 125

Thr Val Ala Cys Phe Pro Val Asp Thr Asn Phe Pro Tyr Glu Pro Ser
130 135 140

Gly Arg Phe Phe His Asn Gly Arg Phe Val Gln Arg Leu Arg Gln Lys
145 150 155 160

Ala Ser Ser Leu Pro Asn Val Arg Leu Glu Glu Gly Thr Val Arg Ser
165 170 175

Leu Ile Glu Glu Lys Gly Val Val Lys Gly Val Thr Tyr Lys Asn Ser
180 185 190

Ser Gly Glu Glu Thr Thr Ser Phe Ala Pro Leu Thr Val Val Cys Asp
195 200 205

Gly Cys His Ser Asn Leu Arg Arg Ser Leu Asn Asp Asn Asn Ala Glu
210 215 220

Val Thr Ala Tyr Glu Ile Gly Tyr Ile Ser Arg Asn Cys Arg Leu Glu
225 230 235 240

Gln Pro Asp Lys Leu His Leu Ile Met Ala Lys Pro Ser Phe Ala Met
245 250 255

Leu Tyr Gln Val Ser Ser Thr Asp Val Arg Cys Asn Phe Glu Leu Leu
260 265 270

Ser Lys Asn Leu Pro Ser Val Ser Asn Gly Glu Met Thr Ser Phe Val
275 280 285

Arg Asn Ser Ile Ala Pro Gln Val Pro Leu Lys Leu Arg Lys Thr Phe
290 295 300

Leu Lys Gly Leu Asp Glu Gly Ser His Ile Lys Ile Thr Gln Ala Lys
305 310 315 320

Arg Ile Pro Ala Thr Leu Ser Arg Lys Lys Gly Val Ile Val Leu Gly
325 330 335

Asp Ala Phe Asn Met Arg His Pro Val Ile Ala Ser Gly Met Met Val
340 345 350

Leu Leu Ser Asp Ile Leu Ile Leu Ser Arg Leu Leu Lys Pro Leu Gly
355 360 365

Asn Leu Gly Asp Glu Asn Lys Val Ser Glu Val Met Lys Ser Phe Tyr
370 375 380

Ala Leu Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Gly Asn Ser
385 390 395 400

Phe Trp Gln Val Leu Ile Ala Ser Thr Asp Glu Ala Lys Glu Ala Met
405 410 415

Arg Gln Gly Cys Phe Asp Tyr Leu Ser Ser Gly Gly Phe Arg Thr Ser
420 425 430

Gly Leu Met Ala Leu Ile Gly Gly Met Asn Pro Arg Pro Leu Ser Leu
435 440 445

Phe Tyr His Leu Phe Val Ile Ser Leu Ser Ser Ile Gly Gln Leu Leu
450 455 460

Ser Pro Phe Pro Thr Pro Leu Arg Val Trp His Ser Leu Arg Leu Leu
465 470 475 480

Asp Leu Ser Leu Lys Met Leu Val Pro His Leu Lys Ala Glu Gly Ile
485 490 495

Gly Gln Met Leu Ser Pro Thr Asn Ala Ala Ala Tyr Arg Lys Ser Tyr
500 505 510

Met Ala Ala Thr Val Val
515

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1892 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Brassica napus

(B) STRAIN: Westar

(D) DEVELOPMENTAL STAGE: 14 day greening-etiolated
(F) TISSUE TYPE: hypocotyls

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Tsang

(B) CLONE: pDR411

(ix) FEATURE:

(A) NAME/KEY: 5'UTR
(B) LOCATION: 1..28

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 29..1466

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1467..1623

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1624..1697

(ix) FEATURE:

(A) NAME/KEY: 3'UTR

(B) LOCATION: 1698..1893

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCACGCGTCC GCGGACGCGT GCCCAGATAT GGATCTAGCT TTTCCGCACG TTTGTTTGTG 60
GACGCTACTC GCCTTTGTGC TGACTTGGAC AGTGTTCTAC GTCAACAACA GGAGGAAGAA 120
GGTGGCGAAG TTACCCGATG CGGCGACAGA GGTGAGAAGA GACGGTGATG CTGACGTCAT 180
CATCGTCGGA GCTGGTGTTG GAGGTTCAGC TCTCGCCTAC GCTCTTGCAA AGGATGCGCG 240
TCGAGTACAT GTGATAGAGA GCGACATGAG GGAACCAGTG AGAATGATGG GTGAATTTAT 300
GCAACCCGCT GGACGACTAC TGCTTTCTAA GCTTGGTCTT GAAGATTGTT TGGAGGGAAT 360
AGATGAACAG ATAGCCACAG GCTTAGCAGT TTATAAGGAC GGACAAAAAG CACTGGTGTC 420
TTTTCCAGAG GACAACGACT TTCCTTATGA ACCTACTGGT CGAGCTTTTT ATAATGGCCG 480
TTTTGTCCAG AGACTCCCCC AAAAGGCTTC TTCGCTCCCC ACTGTACAAC TTGAAGAAGG 540
GACTGTAAAA TCTTTGATAG AAGAAAAAGG AGTGATCAAA GGAGTGACAT ACAAGAATAG 600
TGCAGGCGAA GAAACGACTG CATTTGCACC TCTCACAGTG GTATGCGACG GTTGCTATTC 660
AAACCTTCGT CGGTCTGTTA AGGACAACAA TGCGGAGGTT ATATCGTACC AAGTTGGTTA 720
CGTCTCAAAG AATTGTCAGC TTGAAGATCC TGAAAAGTTA AAATTGATAA TGTCTAAACC 780
TTCCTTCACC ATGTTGTATC AAATAACCAG CACCGATCTT CGTTGTGTTA TGGAGATTTT 840
CCCCGCCAAT ATTCCTTCTA TTTCAAATGG CCAAATGGCT GTTTATTTGA AAAATACTAT 900
^CCTCAG GTACCTCCAG AACTCGGCAA AATATTTTTG AAAGGAATTG ATGAGGGAGC 960
ACAAATTAAA GCGATGCCAA CAAAGAGAAT GGAAGCTACT TTGAGCGAAA AGCAAGGAGT 1020
GATTGTGTTG GGAGATGCAT TCAACATGCG CCACCCAGCG ATTGCCTCTG GAATCATGGT 1080
TGTATTATCT GACATTCTCA TTCTACGCCG CCTTCTCCAC CCATGCGAA ACCTCAGTGA 1140
TGCAAATAAA GTATCAGAAG TTATTAAGTC ATTTTATGTC ATCCGAAAGC CAATGTCAGC 1200
GACGGTGAAC ACGCTAGGAA ATGCATTTTC TCAAGTGCTA ATTGCATCTA CGGACGAAGC 1260
AAAAGAAGCG ATGCGACAAG GCTGTTTTGA TTACCTCTCT AGTGGCGGCT TTCGCACGTC 1320
AGGAATGATG GCTCTGCTCG GTGGCATGAA CCCTCGACCA CTCTCTCTCA TCTTTCATCT 1380
ATGTGGTATT ACTCTATCCT CCATTGGTCA ACTGCTCTCG CCATTTCCAT CTCCTCTTGG 1440
CATTTGGCAT AGCCTCAGAC TTTTTGGTGT AAGTCATTAT CTCCCTCCCT ATGTTATTTA 1500
CATATTTTTC TTTGTGTTAT ATATTTTGTA AATAATTTAC AATTGAATTT TGACATTTTC 1560
TTGTTGTTTA TGTGTATGCC TAATTGTCTA TGAAAATGTT GGTTCCTCAT CTTAAGGCTG 1620
AAGGGGTTAG CCAAATGCTG TCTCCAGCAT ACGCAGCCGC GTATCGCAAA AGCTATATGA 1680
CCGCAACCGC TCTCTAAGCA TCGATGATAA GAACCGCGAA TGATACTATG ACATATTTGG 1740
AGCGCTAGTA TTTTGTGGTT TTGCATCCGT TAAAAATTTA AAATGTGTTG CTGTGTGTTT 1800
ACTATTATTA GTGTATTACC TGGAAAATAC CCGTGGGTAT ATTCTAAATG TATAAAATAT 1860
TGTGATAAAT AAAACGACTC TCCGTTTGGT TGG 1893
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 572 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mus musculus

(B) STRAIN: B6CBA
(D) DEVELOPMENTAL STAGE: 6-8 weeks
(F) TISSUE TYPE: liver

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Lambda ZAP vector Stratagene catalog #935302
(B) CLONE: pKMSE-17

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Kosuga, K.
Hata, S.
Osumi, T.
Sakakibara, J.
Ono, T.

(B) TITLE: Nucleotide sequence of a cDNA for mouse
squalene epoxidase

(C) JOURNAL: Biochim. Biophys. Acta

(D) VOLUME: 1260

(E) ISSUE: 3

(F) PAGES: 345-348
(G) DATE: 1995

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

H.t Trp Thr Phe Leu Oly ll e Ala Thr Phe Thr Tyr Phe Tyr Lys Lys 

5 10 15 

Cy. Gly A«p Val Thr Leu Al* A.n Lya clu Leu Leu Leu Cys V.l Leu 

Val Phe Leu Ser Leu Gly Leu Val Leu iw » ^ 

y i-«u vai beu Ser Tyr Arg Cys Arg Hi a Arg 

Hi- Oly Gly Leu Leu Gly Arg Hi. Gin Ser Gly Al, Gin Phe Al. Ala 

Phe Ser A.p lie Le U Ser Al, Leu Pro Leu II. Gly Phe Phe Trp Ala 

65 70 

75 80 
Lys Ser Pro Glu Ser Glu Lys Lys Glu Gln [...]
90

Arg Lys Glu Ile Gly Leu Ser Glu Thr Thr Leu Thr Gly Ala Ala Thr
100 105 110

Ser Val Ser Thr Ser Phe Val Thr Asp Pro Glu Val Ile Ile Val Gly
115 120 125

Ser Gly Val Leu Gly Ser Ala Leu Ala Ala Val Leu Ser Arg Asp Gly
130 135 140

Arg Lys Val Thr Val Ile Glu Arg Asp Leu Lys Glu Pro Asp Arg Ile
145 150 155 160

Val Gly Glu Leu Leu Gln Pro Gly Gly Tyr Arg Val Leu Gln Glu Leu
165 170 175

Gly Leu Gly Asp Thr Val Glu Gly Leu Asn Ala His His Ile His Gly
180 185 190

Tyr Ile Val His Asp Tyr Glu Ser Arg Ser Glu Val Gln Ile Pro Tyr
195 200 205

Pro Leu Ser Glu Thr Asn Gln Val Gln Ser Gly Ile Ala Phe His His
210 215 220

Gly Arg Phe Ile Met Ser Leu Arg Lys Ala Ala Met Ala Glu Pro Asn
225 230 235 240

Val Lys Phe Ile Glu Gly Val Val Leu Gln Leu Leu Glu Glu Asp Asp
245 250 255

Ala Val Ile Gly Val Gln Tyr Lys Asp Lys Glu Thr Gly Asp Thr Lys
260 265 270

Glu Leu His Ala Pro Leu Thr Val Val Ala Asp Gly Leu Phe Ser Lys
275 280 285

Phe Arg Lys Ser Leu Ile Ser Ser Lys Val Ser Val Ser Ser His Phe
290 295 300

Val Gly Phe Leu Met Lys Asp Ala Pro Gln Phe Lys Pro Asn Phe Ala
305 310 315 320



Glu Leu Val Leu Val Asn Pro Ser Pro Val Leu Ile Tyr Gln Ile Ser
325 330 335

Ser Ser Glu Thr Arg Val Leu Val Asp Ile Arg Gly Glu Leu Pro Arg
340 345 350

Asn Leu Arg Glu Tyr Met Ala Glu Gln Ile Tyr Pro Gln Leu Pro Glu
355 360 365

His Leu Lys Glu Ser Phe Leu Glu Ala Ser Gln Asn Gly Arg Leu Arg
370 375 380

Thr Met Pro Ala Ser Phe Leu Pro Pro Ser Ser Val Asn Lys Arg Gly
385 390 395 400

Val Leu Ile Leu Gly Asp Ala Tyr Asn Leu Arg His Pro Leu Thr Gly
405 410 415

Gly Gly Met Thr Val Ala Leu Lys Asp Ile Lys Leu Trp Arg Gln Leu
420 425 430

Leu Lys Asp Ile Pro Asp Leu Tyr Asp Asp Ala Ala Ile Phe Gln Ala
440 445

Lys Lys Ser Phe Phe Trp Ser Arg Lys Arg Thr His Ser Phe Val Val
450 455 460

Asn Val Leu Ala Gln Ala Leu Tyr Glu Leu Phe Ser Ala Thr Asp Asp
465 470 475 480

Ser Leu His Gln Leu Arg Lys Ala Cys Phe Leu Tyr Phe Lys Leu Gly
485 490 495

Gly Glu Cys Val Thr Gly Pro Val Gly Leu Leu Ser Ile Leu Ser Pro
500 505

His Pro Leu Val Leu Ile Arg His Phe Phe Ser Val Ala Ile Tyr Ala
515 520 525

Thr Tyr Phe Cys Phe Lys Ser Glu Pro Trp Ala Thr Lys Pro Arg Ala
530 535 540

Leu Phe Ser Ser Gly Ala Val Leu Tyr Lys Ala Cys Ser Ile Leu Phe
545 550 555 560

Pro Leu Ile Tyr Ser Glu Met Lys Tyr Leu Val His
565 570

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 573 amino acids 

(B) TYPE: amino acid
(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Rattus norvegicus
(F) TISSUE TYPE: kidney
(H) CELL LINE: NRX

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: pcD2 library of H. Okayama

(B) CLONE: Tb-1

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Sakakibara, J.
Watanabe, R.
Kanai, R.
Ono, T.

(B) TITLE: Molecular cloning and expression of rat
squalene epoxidase

(C) JOURNAL: J. Biol. Chem.
(D) VOLUME: 270

(E) ISSUE: 1

(F) PAGES: 17-20

(G) DATE: 1995

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Trp Thr Phe Leu Gly Ile Ala Thr Phe Thr Tyr Phe Tyr Lys Lys
1 5 10 15

Cys Gly Asp Val Thr Leu Ala Asn Lys Glu Leu Leu Leu Cys Val Leu
20 25 30

Val Phe Leu Ser Leu Gly Leu Val Leu Ser Tyr Arg Cys Arg His Arg
35 40 45

Asn Gly Gly Leu Leu Gly Arg His Gln Ser Gly Ser Gln Phe Ala Ala
50 55 60

Phe Ser Asp Ile Leu Ser Ala Leu Pro Leu Ile Gly Phe Phe Trp Ala
65 70 75 80

Lys Ser Pro Pro Glu Ser Glu Lys Lys Glu Gln Leu Glu Ser Lys Arg
85 90 95

Arg Arg Lys Glu Val Asn Leu Ser Glu Thr Thr Leu Thr Gly Ala Ala
100 105 110

Thr Ser Val Ser Thr Ser Ser Val Thr Asp Pro Glu Val Ile Ile Ile
115 120 125

Gly Ser Gly Val Leu Gly Ser Ala Leu Ala Thr Val Leu Ser Arg Asp
130 135 140

Gly Arg Thr Val Thr Val Ile Glu Arg Asp Leu Lys Glu Pro Asp Arg
145 150 155 160

Ile Leu Gly Glu Cys Leu Gln Pro Gly Gly Tyr Arg Val Leu Arg Glu
165 170 175

Leu Gly Leu Gly Asp Thr Val Glu Ser Leu Asn Ala His His Ile His
180 185 190

Gly Tyr Val Ile His Asp Cys Glu Ser Arg Ser Glu Val Gln Ile Pro
195 200 205

Tyr Pro Val Ser Glu Asn Asn Gln Val Gln Ser Gly Val Ala Phe His
210 215 220

His Gly Lys Phe Ile Met Ser Leu Arg Lys Ala Ala Met Ala Glu Pro
225 230 235 240

Asn Val Lys Phe Ile Glu Gly Val Val Leu Arg Leu Leu Glu Glu Asp
245 250 255

Asp Ala Val Ile Gly Val Gln Tyr Lys Asp Lys Glu Thr Gly Asp Thr
260 265 270

Lys Glu Leu His Ala Pro Leu Thr Val Val Ala Asp Gly Leu Phe Ser
275 280 285

Lys Phe Arg Lys Asn Leu Ile Ser Asn Lys Val Ser Val Ser Ser His
290 295 300

Phe Val Gly Phe Ile Met Lys Asp Ala Pro Gln Phe Lys Ala Asn Phe
305 310 315 320

Ala Glu Leu Val Leu Val Asp Pro Ser Pro Val Leu Ile Tyr Gln Ile
325 330 335

Ser Pro Ser Glu Thr Arg Val Leu Val Asp Ile Arg Gly Glu Leu Pro
340 345 350

Arg Asn Leu Arg Glu Tyr Met Thr Glu Gln Ile Tyr Pro Gln Ile Pro

Asp His Leu Lys Glu Ser Phe Leu Glu Ala Cys Gln Asn Ala Arg Leu
370 375 380

Arg Thr Met Pro Ala Ser Phe Leu Pro Pro Ser Ser Val Asn Lys Arg
385

Gly Val Leu Leu Leu Gly Asp Ala Tyr Asn Leu Arg His Pro Leu Thr
405 410 415

Gly Gly Gly Met Thr Val Ala Leu Lys Asp Ile Lys Ile Trp Arg Gln
420 425 430

Leu Leu Lys Asp Ile Pro Asp Leu Tyr Asp Asp Ala Ala Ile Phe Gln
435 440 445

Ala Lys Lys Ser Phe Phe Trp Ser Arg Lys Arg Ser His Ser Phe Val
450 455 460

Val Asn Val Leu Ala Gln Ala Leu Tyr Glu Leu Phe Ser Ala Thr Asp
465 470 475 480

Asp Ser Leu Arg Gln Leu Arg Lys Ala Cys Phe Leu Tyr Phe Lys Leu
485 490 495

Gly Gly Glu Cys Leu Thr Gly Pro Val Gly Leu Leu Ser Ile Leu Ser
500 505 510

Pro Asp Pro Leu Leu Leu Ile Arg His Phe Phe Ser Val Ala Val Tyr
515 520 525

Ala Thr Tyr Phe Cys Phe Lys Ser Glu Pro Trp Ala Thr Lys Pro Arg
530 535 540

Ala Leu Phe Ser Ser Gly Ala Ile Leu Tyr Lys Ala Cys Ser Ile Ile
545 550 555 560

Phe Pro Leu Ile Tyr Ser Glu Met Lys Tyr Leu Val His
565 570



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 496 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Saccharomyces cerevisiae

(B) STRAIN: A2-M8

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Jandrositz, A.
Hoegenauer, G.
Turnowsky, F.

(B) TITLE: The gene encoding squalene epoxidase from
Saccharomyces cerevisiae: cloning and
characterization

(C) JOURNAL: Gene

(D) VOLUME: 107

(F) PAGES: 155-160

(G) DATE: 1991

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ser Ala Val Asn Val Ala Pro Glu Leu Ile Asn Ala Asp Asn Thr
1 5 10 15

Ile Thr Tyr Asp Ala Ile Val Ile Gly Ala Gly Val Ile Gly Pro Cys
20 25 30

Val Ala Thr Gly Leu Ala Arg Lys Gly Lys Lys Val Leu Ile Val Glu

Arg Asp Trp Ala Met Pro Asp Arg Ile Val Gly Glu Leu Met Gln Pro
50 55 60

Gly Gly Val Arg Ala Leu Arg Ser Leu Gly Met Ile Gln Ser Ile Asn
65 70 75 80

Asn Ile Glu Ala Tyr Pro Val Thr Gly Tyr Thr Val Phe Phe Asn Gly
85 90 95

Glu Gln Val Asp Ile Pro Tyr Pro Tyr Lys Ala Asp Ile Pro Lys Val
105

Glu Lys Leu Lys Asp Leu Val Lys Asp Gly Asn Asp Lys Val Leu Glu
115 120 125

Asp Ser Thr Ile His Ile Lys Asp Tyr Glu Asp Asp Glu Arg Glu Arg

Gly Val Ala Phe Val His Gly Arg Phe Leu Asn Asn Leu Arg Asn Ile
145 150 155 160

Thr Ala Gln Glu Pro Asn Val Thr Arg Val Gln Gly Asn Cys Ile Glu
165 170 175

Ile Leu Lys Asp Glu Lys Asn Glu Val Val Gly Ala Lys Val Asp Ile
180 185 190

Asp Gly Arg Gly Lys Val Glu Phe Lys Ala His Leu Thr Phe Ile Cys
195 200 205

[...]
210

Pro Thr Val Gly Ser Ser Phe Val Gly Met Ser Leu Phe Asn Ala Lys
225 230 235 240

Asn Pro Ala Pro Met His Gly His Val Ile Phe Gly Ser Asp His Met
245 250 255

Pro Ile Leu Val Tyr Gln Ile Ser Pro Glu Glu Thr Arg Ile Leu Cys
260 265 270

Ala Tyr Asn Ser Pro Lys Val Pro Ala Asp Ile Lys Ser Trp Met Ile
275 280 285

Lys Asp Val Gln Pro Phe Ile Pro Lys Ser Leu Arg Pro Ser Phe Asp
290 295 300

Glu Ala Val Ser Gln Gly Lys Phe Arg Ala Met Pro Asn Ser Tyr Leu
305 310 315 320

Pro Ala Arg Gln Asn Asp Val Thr Gly Met Cys Val Ile Gly Asp Ala
325 330 335

Leu Asn Met Arg His Pro Leu Thr Gly Gly Gly Met Thr Val Gly Leu
340 345 350

His Asp Val Val Leu Leu Ile Lys Lys Ile Gly Asp Leu Asp Phe Ser
355 360 365

Asp Arg Glu Lys Val Leu Asp Glu Leu Leu Asp Tyr His Phe Glu Arg
370 375 380

Lys Ser Tyr Asp Ser Val Ile Asn Val Leu Ser Val Ala Leu Tyr Ser
385 390 395 400

Leu Phe Ala Ala Asp Ser Asp Asn Leu Lys Ala Leu Gln Lys Gly Cys
405 410 415

Phe Lys Tyr Phe Gln Arg Gly Gly Asp Cys Val Asn Lys Pro Val Glu
420 425 430

Phe Leu Ser Gly Val Leu Pro Lys Pro Leu Gln Leu Thr Arg Val Phe
435 440 445

Phe Ala Val Ala Phe Tyr Thr Ile Tyr Leu Asn Met Glu Glu Arg Gly
450 455 460

Phe Leu Gly Leu Pro Met Ala Leu Leu Glu Gly Ile Met Ile Leu Ile
465 470 475 480

Thr Ala Ile Arg Val Phe Thr Pro Phe Leu Phe Gly Glu [...]
485 490 495



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 536 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Arabidopsis thaliana

(B) STRAIN: Columbia

(D) DEVELOPMENTAL STAGE: 4 different stages and tissues

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Lambda-PRL2

(B) CLONE: 250F2T7

(x) PUBLICATION INFORMATION:
(A) AUTHORS: Newman, T.
deBruijn, F. J.
Green, P.
Keegstra, K.
Kende, H.
McIntosh, L.
Ohlrogge, J.
Raikhel, N.
Somerville, S.
Thomashow, M.
(B) TITLE: Genes galore: a summary of methods for
accessing results from large-scale partial
sequencing of anonymous Arabidopsis cDNA clones

(C) JOURNAL: Plant Physiol.

(D) VOLUME: 106

(F) PAGES: 1241-1255
(G) DATE: 1994

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAGAACATAT AAAAGCCATG CCAACAAAGA AGATGACAGC TACTTTGAGC GAGAAGAAAG 60 

GAGTGATTTT ATTGGGAGAT GCATTCAACA TGCGTCATCC AGCAATCGCA TCTGGAATGA 120

TGGTTTTATT ATCTGACATT CTCATTCTAC GCCGTCTTCT CCAGCCATTA AGCAACCTTG 180

GCAATGCGCA AAAAATCTCA CAAGTTATCA AGTCCTTTTA TGATATCCGC AAGCCAATGT 240

CAGCGACAGT TAACACGTTA GGAAATGCAT TCTCTCAAGT GCTAGTTGCA TCGACGGACG 300

AAGCAAAAGA GGCAATGAGA CAAGGTTGCT ATGATTACCT CTCTAGTGGT GGGTTTCGCA 360

CGTCAGGGAT GATGGCTTTG CTAGGCGGAT GAACCCTCGT CCGATCTCTC NCATCNANCA 420

NCNAGGGGAA CACNCANCCC CATNGGCATC AACNCCNCAT TCCCNNCCCT TCGATTGGAA 480

CCTCGACTTT TGGTGGNNNA AAGGTGGCCC CCCANGGGAA GGTTCCATNT NTCCNC 536 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 540 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Ricinus communis

(B) STRAIN: Baker 296

(D) DEVELOPMENTAL STAGE: immature castor fruits
(F) TISSUE TYPE: endosperm and embryo

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: lambdaZAPST

(B) CLONE: pcrs547

(x) PUBLICATION INFORMATION:

(A) AUTHORS: van de Loo, F. J.
Turner, S.
Somerville, C.
(B) TITLE: Expressed sequence tags from developing
castor seeds

(C) JOURNAL: Plant Physiol.

(D) VOLUME: 108

(F) PAGES: 1141-1150
(G) DATE: 1995

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTTGAGCTCA GAGTCACAGA TATAGACATC CTAGGGAAAA CATTCTCCTA TAAACTAAAG 60

CGTATTACAA TTCACACTTC TTTTCCCCTC AACTTTGATT TGAACAAAGG GATGAGATTA 120

AAACCAAAAT GAGAAACGCC CCGTTCCTTC TTGTCACGAA TTTTTCACTC ACATTCTTGT 180

CAAACTAATT GCATTCAACA GGAGGAGCTC TATAATATGC TGGGACGGTT GCGGGGAAGA 240

ACATCTGTCT AACTCCTTCT GCCTTGATAA TGGGGAAGAT GATTCCTGAT GCACCCGATA 300

TCAACCTAGC TCCAACCCAG ACGCGCTTAG GTGAAGGGAA TGGCAGTAAC AAAGGGGGGG 360

CCCGGTACCC AATTTGCCCT ATAGTGAGCC GTATTCAATN ACTGGCCGTT GTTTCAACGT 420

GTGCCTTGGG AAACCCTGGG GTNCCACTTA TTGCTTCAGA CATCCCCTTT GCANTTGGTA 480

TTNGAGGGGC CGACCGTTGC CTCCAANAGT NCNCGTTNAA TTGGGTTGAA ANTTNCGGGA 540



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 503 amino acids 

(B) TYPE : amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Asp Leu Ala Phe Pro His Val Cys Leu Trp Thr Leu Leu Ala Phe
1 5 10 15

Val Leu Thr Trp Thr Val Phe Tyr Val Asn Asn Arg Arg Lys Lys Val
20 25 30

Ala Lys Leu Pro Asp Ala Ala Thr Glu Val Arg Arg Asp Gly Asp Ala
35 40 45

Asp Val Ile Ile Val Gly Ala Gly Val Gly Gly Ser Ala Leu Ala Tyr
50 55 60

Ala Leu Ala Lys Asp Gly Arg Arg Val His Val Ile Glu Arg Asp Met
65 70 75 80

Arg Glu Pro Val Arg Met Met Gly Glu Phe Met Gln Pro Gly Gly Arg
85 90 95

Leu Leu Leu Ser Lys Leu Gly Leu Glu Asp Cys Leu Glu Gly Ile Asp
100 105 110

Glu Gln Ile Ala Thr Gly Leu Ala Val Tyr Lys Asp Gly Gln Lys Ala
115 120 125

Leu Val Ser Phe Pro Glu Asp Asn Asp Phe Pro Tyr Glu Pro Thr Gly
130 135 140

Arg Ala Phe Tyr Asn Gly Arg Phe Val Gln Arg Leu Arg Gln Lys Ala

Ser Ser Leu Pro Thr Val Gln Leu Glu Glu Gly Thr Val Lys Ser Leu
165 170 175

Ile Glu Glu Lys Gly Val Ile Lys Gly Val Thr Tyr Lys Asn Ser Ala
180 185 190

Gly Glu Glu Thr Thr Ala Phe Ala Pro Leu Thr Val Val Cys Asp Gly
195 200 205

Cys Tyr Ser Asn Leu Arg Arg Ser Val Asn Asp Asn Asn Ala Glu Val
210 215 220

Ile Ser Tyr Gln Val Gly Tyr Val Ser Lys Asn Cys Gln Leu Glu Asp
225 230 235 240

Pro Glu Lys Leu Lys Leu Ile Met Ser Lys Pro Ser Phe Thr Met Leu
245 250 255

Tyr Gln Ile Ser Ser Thr Asp Val Arg Cys Val Met Glu Ile Phe Pro
260 265 270

Gly Asn Ile Pro Ser Ile Ser Asn Gly Glu Met Ala Val Tyr Leu Lys
275 280 285

Asn Thr Met Ala Pro Gln Val Pro Pro Glu Leu Arg Lys Ile Phe Leu
290 295 300

Lys Gly Ile Asp Glu Gly Ala Gln Ile Lys Ala Met Pro Thr Lys Arg

Met Glu Ala Thr Leu Ser Glu Lys Gln Gly Val Ile Val Leu Gly Asp
325 330 335

Ala Phe Asn Met Arg His Pro Ala Ile Ala Ser Gly Met Met Val Val
340 345 350

Leu Ser Asp Ile Leu Ile Leu Arg Arg Leu Leu Gln Pro Leu Arg Asn
355 360 365

Leu Ser Asp Ala Asn Lys Val Ser Glu Val Ile Lys Ser Phe Tyr Val
370 375 380

Ile Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Gly Asn Ala Phe
385 390 395 400

Ser Gln Val Leu Ile Ala Ser Thr Asp Glu Ala Lys Glu Ala Met Arg
405 410 415

Gln Gly Cys Phe Asp Tyr Leu Ser Ser Gly Gly Phe Arg Thr Ser Gly
420 425 430

Met Met Ala Leu Leu Gly Gly Met Asn Pro Arg Pro Leu Ser Leu Ile
435 440 445

Phe His Leu Cys Gly Ile Thr Leu Ser Ser Ile Gly Gln Leu Leu Ser
450 455 460

Pro Phe Pro Ser Pro Leu Gly Ile Trp His Ser Leu Arg Leu Phe Gly
465 470 475 480

Val Ser Gln Met Leu Ser Pro Ala Tyr Ala Ala Ala Tyr Arg Lys Ser
485 490 495

Tyr Met Thr Ala Thr Ala Leu
500

CLAIMS:

1. An isolated and cloned DNA suitable for introduction
into a genome of a plant to suppress expression of
squalene epoxidase by said plant below natural levels,
characterised in that said DNA has a sequence
corresponding at least in part to a squalene epoxidase
gene of a plant.

2. DNA according to claim 1, characterised by a
sequence corresponding to all or part of a specific
sequence selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:9 and SEQ ID NO:10; or having at least
60% homology thereto.

3. DNA according to claim 2, characterised in that said
part of said sequence comprises at least 20 consecutive
nucleotides of said specific sequence.

4. DNA according to claim 2, characterised in that said
part of said sequence comprises at least 100 consecutive
nucleotides of said specific sequence.

5. A process of producing genetically-modified plants
having increased levels of squalene in tissues of the
plants compared to corresponding wild-type plants,
wherein the plant genome is modified to suppress
expression of squalene epoxidase by said plant,
characterised in that said genome is modified by
introducing at least one exogenous DNA sequence that
corresponds, at least in part, to one or more endogenous
squalene epoxidase genes of said plant.

6. A process according to claim 5, characterised in
that said DNA sequence introduced into said plant genome
has at least 60% homology to said one or more of said
endogenous squalene epoxidase genes.

7. A process according to claim 5, characterised in
that said exogenous DNA has a sequence corresponding to
all or part of a specific sequence selected from SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:9 and SEQ ID
NO:10; or has at least 60% homology thereto.

8. A process according to claim 7, characterised in
that said part of said sequence comprises at least [...]
consecutive nucleotides of said specific sequence.

9. A process according to claim 7, characterised in
that said part of said sequence comprises at least 100
consecutive nucleotides of said specific sequence.

10. A process as claimed in claim 5, characterised in
that said at least one DNA sequence introduced into said
genome is arranged in a sense orientation relative to a
transcriptional promoter such that it is capable of
decreasing said expression by co-suppression or homology-
dependent gene silencing.

11. A process as claimed in claim 5, characterised in
that said at least one DNA introduced into said
genome forms part of a [...] one or more of the
endogenous squalene epoxidase genes of said plant.

12. A process according to claim [...], characterised in
that said exogenous DNA is obtained by identifying at
least one squalene epoxidase gene of said plant, and
sequencing [...]

13. A process according to claim 5, characterised in
that said exogenous DNA is introduced into said
plant by a procedure selected from Agrobacterium-mediated
and particle gun transformation techniques.

14. A process of producing genetically-modified plants
having increased levels of squalene in tissues of the
plants compared to corresponding wild-type plants,
wherein the plant genome is modified to suppress
expression of squalene epoxidase by said plant by
introducing a nucleotide sequence that reduces or
[...] of squalene epoxidase into a [...],
characterised in that said DNA in[...] a
transcriptional promoter [...] a sequence [...] such
that when transcribed from the promoter, resulting RNA is
complementary or antisense to all or part of [...]
squalene epoxidase messenger RNA transcribed from a
squalene epoxidase gene of said plant.

15. A process according to claim 14, characterised in
that said nucleotide sequence comprises all or part of a
sequence selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:[...]
or a sequence having at least [...] homology [...]



16. Plasmid pDR411 (ATCC 97845).

17. Plasmid pDR111 (ATCC 97846).

18. Plasmid P129F12T7 (ATCC 97847).



19. A vector for introducing a nucleotide sequence into
a plant genome, characterised in that said vector
comprises a construct containing a nucleotide sequence
that [...] to a plant squalene epoxidase [...]
part thereof, positioned between a transcriptional
promoter segment and a transcriptional termination
segment.

20. A vector according to claim 19, characterised in
that said nucleotide sequence comprises all or part of a
specific sequence selected from SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:5, SEQ ID NO:9 and SEQ ID NO:10; or has at
least 60% homology thereto.

21. Vector pSE129A (ATCC 97910).

22. Vector pSE411A (ATCC 97908).

23. Vector pSE111A (ATCC 97909).

24. A genetically-modified plant capable of accumulating 
squalene at levels higher than the corresponding wild- 
type plant, characterised in that said genetically- 
modified plant has been produced by a process according 
to claim 5, claim 6, claim 7, claim 8, claim 9, claim 10, 
claim 11, claim 12, claim 13, claim 14 or claim 15. 

25. A seed of a genetically-modified oilseed plant 
containing squalene at levels higher than seeds of 
equivalent wild-type plants, characterised in that said
genetically-modified plant has been produced by a process 
according to claim 5, claim 6, claim 7, claim 8, claim 9, 
claim 10, claim 11, claim 12, claim 13, claim 14 or claim 
15. 

26. A process of producing squalene, characterised by 
growing a genetically-modified plant as defined in claim 
24, harvesting said plant or seeds of said plant, and 
extracting squalene from said harvested plant or seeds. 
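
The sequence-based limits recited in the claims above (a run of consecutive nucleotides shared with a listed sequence, a percentage homology, and RNA that is antisense to the target) can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not a method taken from this document: the claims do not define how homology is calculated, so the sketch simply assumes ungapped, position-by-position identity over two equal-length segments, and the function names `longest_shared_run`, `percent_identity` and `reverse_complement` are hypothetical.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest stretch of consecutive nucleotides found in both
    sequences. Ambiguous bases (anything other than A, C, G, T) never match."""
    a, b = candidate.upper(), reference.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1] and a[i - 1] in "ACGT":
                cur[j] = prev[j - 1] + 1  # extend the run ending at (i-1, j-1)
                best = max(best, cur[j])
        prev = cur
    return best


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two equal-length segments, in percent.
    ASSUMPTION: stands in for 'homology', which the text does not define."""
    if not a or len(a) != len(b):
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(x == y and x in "ACGT" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite (antisense) strand, read 5' to 3'."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]


if __name__ == "__main__":
    # First 60 nucleotides of SEQ ID NO: 9 as listed above, spaces removed.
    reference = "GAGAACATATAAAAGCCATGCCAACAAAGAAGATGACAGCTACTTTGAGCGAGAAGAAAG"
    fragment = reference[10:35]  # a 25-nucleotide piece of the reference
    print(longest_shared_run(fragment, reference) >= 20)
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC") >= 60.0)
    print(reverse_complement("ATGC"))
```

A gapped alignment tool would normally be used for real comparisons; the sketch only shows how the numerical thresholds could be checked once a comparison rule has been fixed.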